INTRODUCTION


This report summarizes project activities taking place since the date of the
last annual report, June 30, 1982. That report summarized our activities
directed to the development of an integrated optical spatial light modulator
and to the successful demonstration of its utility in a digital optical
correlator, and marked the point of redirection of the project efforts towards
numerical optical processing. The specific aims of the current phase of the
work are recounted below and the present status of the research is briefly
described. Activities related to the dissemination of project-developed
information will also be described. The brevity of this report is directly
related to the realities of redirection of effort.

AIMS  OF  THE  PRESENT  PHASE 

The present goal of the project is the demonstration of an integrated optical
numerical (analog) processor for matrix-vector multiplication, utilizing a
computational architecture known as an engagement processor. This architecture
is described in detail in our proposal dated August 13, 1982. Briefly
described, it is a variation of the systolic architecture proposed by Kung(1)
for use in VLSI electronics. In this architecture, the data flow in a highly
synchronized fashion through a region containing processing units of a simple
kind, in a pulsing manner (hence the term "systolic"). With each pulse, or
epoch of time, data are input to a particular processor, utilized, then passed
to an adjacent processor for its use in the next time epoch. A datum is called
from memory only at the boundaries of the processing region and, once called,
is passed from one processing unit to the next in a regular way. This avoids
the repeated references to memory that have been labeled the "Von Neumann
bottleneck", and enables full utilization of the potential speed advantages of
parallel processing using processing arrays. The architecture turns out to be
particularly well suited to integrated optics.
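
The systolic data flow described above can be illustrated with a small
software model. The following Python sketch is not the engagement-processor
design of the proposal; it assumes a generic linear systolic array in which
each cell accumulates one output element, the vector elements are fetched once
at the boundary and then shift one cell per epoch, and the matrix elements are
assumed to arrive at each cell with the matching time skew. The function and
variable names are illustrative only.

```python
def systolic_matvec(a, x):
    """Multiply matrix a (n rows, m columns) by vector x on a simulated
    linear systolic array with one accumulating cell per output element."""
    n, m = len(a), len(x)
    y = [0.0] * n          # one accumulator per processing cell
    pipe = [None] * n      # vector element currently held by each cell
    for epoch in range(n + m - 1):
        # Fetch from memory only at the boundary cell, then shift the stream
        # one cell along the array.
        pipe = [x[epoch] if epoch < m else None] + pipe[:-1]
        for i, xj in enumerate(pipe):
            if xj is not None:
                j = epoch - i          # index of the vector element in cell i
                y[i] += a[i][j] * xj   # a[i][j] is assumed to arrive this epoch
    return y


if __name__ == "__main__":
    print(systolic_matvec([[1, 2], [3, 4]], [5, 6]))
```

Each vector element is read from memory exactly once and is then reused by
every cell it passes, which is the property the text contrasts with repeated
memory references.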

The proposed processor would multiply a 16 x 16 matrix by a 16-element vector,
using the methods developed earlier in the program for the correlator, that
is, using arrays of electrooptically actuated Bragg gratings. The maximum speed
attainable with such a device is determined by a variety of factors; for the
demonstration device, however, the speed will be limited by the electronics
used to insert the data into the integrated optical circuit (IOC). The
ultimate limit would be set by the capacitance of the electrodes, about 10 pF,
at a data rate of about 1 Gb/sec.

There are a number of difficult problems associated with the development of
this processor. First, the fabrication of a photolithographic mask containing
features 2 mm long x 3.4 µm wide in two arrays of 32 sets of 8 finger pairs (a
total of 1024 lines of aspect ratio 588) is a challenge to any maskmaker. In
the present case, the two arrays must be precisely aligned relative to one
another, adding to the difficulty. Second, the drive electronics for such a
processor are far from trivial. Indeed, it was decided to make the
demonstration using matrices having constant values across a row to avoid the
need to develop and assemble a fully operational, high-speed electronics net.
Such a development would detract from the novel, optical aspects of the
program, and would not add to the utility of the demonstration. Third, the
connection of the IOC elements to the external signal sources requires
considerable care. Finally, our experience with the earlier program indicates
that we must understand the operation of high-aspect-ratio Bragg gratings in
order to assess the performance of the device and to indicate directions for
improvement. The simple model used with the correlator is not completely
adequate, so some attention will be given to finding an improved theory
predicting response and crosstalk.
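
The mask figures quoted above can be checked with a few lines of arithmetic.
The Python sketch below assumes that each finger pair contributes two lines,
which is an interpretation of the text rather than something it states; the
function names are illustrative.

```python
def mask_line_count(arrays=2, sets_per_array=32, finger_pairs_per_set=8):
    """Total electrode lines on the mask, assuming two fingers per pair."""
    return arrays * sets_per_array * finger_pairs_per_set * 2


def aspect_ratio(length_mm=2.0, width_um=3.4):
    """Length-to-width ratio of a single electrode finger."""
    return length_mm * 1000.0 / width_um


if __name__ == "__main__":
    print(mask_line_count())        # 1024
    print(round(aspect_ratio()))    # 588
```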

PRESENT  STATUS 

A  photolithographic  mask  has  been  designed  and  sent  to  our  maskmaker 
for  fabrication.  As  was  anticipated,  they  have  experienced  some  difficulties 
with  the  fabrication  that  have  led  to  project  delays.  The  mask  is  expected  to 
be  delivered  soon.  It  is  sketched  in  a  highly  schematic  form  in  Fig.  1. 

A second mask, to be used to fabricate a layer for connection to external
devices, has been ordered and delivered. It is pictured in Fig. 2. This
circuit will be fabricated on a ceramic substrate with a hole opened at the
rectangle indicated in the photograph. The IOC will reside in this hole, with
leads bonded to the IOC and to the leadout electrodes.

The drive electronics to input data into the circuit have been designed and
are presently being assembled.


Figure 1. Highly schematic drawing of the "herringbone" electrode pattern for
the matrix-vector multiplier. The actual mask will contain many such segments
rather than the three shown here.


Figure 2. Photograph of mask for electrical connection to IOC. The actual IOC
will be located in an opening indicated here by the central rectangle.


A  number  of  publications  relating  to  the  optical  response  of  Bragg 
gratings  have  been  assembled  and  studied.  There  appear  to  be  two  approaches 
to  developing  a  response  theory.  In  one,  an  exact  solution  to  the  diffraction 
problem  is  used.  The  difficulty  here  is  that  the  exact  solution  is  extremely 
complicated,  and  will  require  careful  computer  implementation  to  be  useful.  In 
the  other,  a  numerical  approach  is  used  from  the  start  to  solve,  approximately, 
the diffraction problem. Both approaches will require some coding. The present
inclination is to use the second approach, if the coding needed is not too
extensive, because it would result in a more generally useful tool for
diffraction analysis.

OTHER  ACTIVITIES 

A publication entitled "Design and Performance of an Integrated-Optical
Digital Correlator" has been prepared, submitted, and accepted for publication
by the IEEE Journal of Lightwave Technology. It is included as Appendix A.

A paper entitled "Integrated-Optical Circuits for Numerical Computation" was
presented at the SPIE Technical Symposium East '83, Washington, D.C., April
5-7, 1983. It is included as Appendix B.

A Gordon Conference on Holography and Optical Information Processing, held
June 21-25, 1982, was attended by Dr. Verber, and results of the project were
discussed there. As a direct result, he has been instrumental in guiding the
thrust of the numerical optical computation community; has presented a paper
at the International Optical Computation Conference in Boston, MA, April 6,
1983; and has participated as discussion leader in a workshop on numerical
optical computation held recently in Atlanta.

Reference 

1. H. T. Kung, "Why Systolic Architectures?", Computer 15, 37 (1982).
DESIGN AND PERFORMANCE OF AN INTEGRATED
OPTICAL DIGITAL CORRELATOR*

C.  M.  Verber,  R.  P.  Kenan  and  J.  R.  Busch 

Battelle  Columbus  Laboratories 
505  King  Avenue 
Columbus,  Ohio  43201 

Abstract 

We describe an integrated optical correlator capable of performing ordinary
binary or bipolar correlations. The device consists of two SAW transducers and
an electrooptic spatial light modulator in a planar Ti-indiffused LiNbO3
waveguide. It is designed to correlate a 32-bit word at a 32 Mbit/sec data
rate.


*Work  supported  by  Air  Force  Office  of  Scientific  Research. 


Introduction

In a previous publication(1) we discussed the operation of an integrated
optical correlator whose active components are a programmable electrooptic
integrated optical spatial light modulator (IOSLM) and a digitally modulated
surface acoustic wave (SAW) transducer. The electrooptically-induced phase
grating and the SAW act upon a guided optical wave to produce a time-varying
optical signal which is proportional to the cross-correlation of the digital
word preprogrammed in the IOSLM with the bit-stream generated by the SAW
transducer. Although this correlator produces the desired correlation signal,
it suffers from a design flaw which causes the correlation signal to appear on
a background of "noise" whose height is proportional to the number of "ones"
in that part of the reference word (in the IOSLM) that has no overlap with the
signal word (carried by the SAW). In the present paper we discuss a modified
correlator in which this design flaw is eliminated and significant flexibility
in usage is introduced. This new correlator design employs the same type of
electrooptic IOSLM as previously discussed. However, it employs two SAW
transducers to generate acoustic-wave grating segments of two different
frequencies; each frequency represents one of the two levels in the binary
signal word. (We refer to these levels as the "zero" level and the "one"
level, corresponding to the ordinary binary notation using 0 and 1. We are,
however, free to choose whatever arithmetic we find convenient; an example of
this is the bipolar choice using -1 and +1, which will be seen to be very
convenient.) This two-SAW design effects an angular separation among the two
desired output beams and the undiffracted light which eliminates the flaw
present in the earlier device.

Correlator  Design  and  Operation 

The operation of the correlator may be understood with reference to Figure 1
which, for simplicity, is a schematic of a 4-bit correlator. The device is
fabricated on a planar Ti-indiffused LiNbO3 waveguide. The active components
are two SAW transducers operating at 459 MHz (level zero) and 875 MHz (level
one) respectively, and an electrooptic IOSLM. The IOSLM is discussed in detail
in Ref. 1. In the figure, the angles between the axes of the three transducers
are greatly exaggerated. The angles are chosen so that light cannot be
diffracted by the IOSLM unless it has first been diffracted by one of the
SAWs. The geometry is such that light diffracted by the low-frequency (level
zero) SAW enters the IOSLM at its lower Bragg angle and light diffracted by
the high-frequency (level one) SAW hits the IOSLM at its upper Bragg angle. If
level one (zero) is represented in the IOSLM by an electrooptic grating
segment which is turned on (off) then the geometry has the following
consequences:


(i) Light diffracted by the low-frequency SAW (0) and passing through an
unenergized IOSLM element (0) emerges from the interaction region in the same
direction as light diffracted by the high-frequency SAW (1) and an energized
IOSLM element (1). We refer to this (0-0) and (1-1) direction as the
"coincidence" or "+1" direction.

(ii) In a similar fashion it can be seen that light resulting from successive
(0-1) or (1-0) coincidences emerges in a second direction. We refer to this as
the "anticoincidence" or "-1" direction.

(iii) Light which, due to less than 100% SAW diffraction efficiency or due to
the absence of SAW pulses, passes undeflected through the SAW region emerges
from the interaction region in a direction which is different from either the
coincidence or anticoincidence direction and therefore does not contribute to
background noise.

The diagram in Figure 2 illustrates the angular relationships in the two-SAW
correlator. The center lines of the SAWs and of the IOSLM are illustrated. It
is important that a ray that addresses the device as shown should encounter
the low-frequency SAW before it encounters the high-frequency SAW. The angles
indicated in Fig. 2 are not exact but are nonetheless quite precise
approximations. The location of the crossing point of the two SAWs and the
IOSLM is not fixed by any specific criterion but is chosen to minimize
fabrication difficulties and to optimize the synchronization of the outputs of
the two SAW transducers. In general, it would be necessary to optimize the
synchronization of the SAWs since the two surface acoustic waves travel in
distinct directions relative to the incident light beam. This results in a
gradual degradation of the interleaving of the two bit-streams corresponding
to the zero and one levels of the data stream. However, because of the small
angles involved, this effect is very small. For the present device, the
misalignment over the width of the optical beam amounts to less than 1% of the
length of one bit. Even this small effect can be minimized by choosing the
location of the SAW transducers so that the two bit streams are precisely
synchronized at the center of the optical beam.

To arrive at a specific correlator design within the geometric constraints
discussed above it is necessary only to specify a data rate and to adopt
procedures which minimize fabrication problems. A data rate of 32 Mbit/sec was
chosen. This, when combined with the SAW velocity, determines the size of the
IOSLM elements. Photolithographic resolution limits of 1 µm indicate that the
high-frequency SAW period should not be less than 4 µm. Finally, we note that
the acoustic waves undergo a diffractional spread by an amount determined by
the acoustic aperture of the transducers, according to the usual formula:


$$\alpha = \Lambda / W_a \qquad (1)$$

where $\alpha$ is the spread angle, $\Lambda$ the acoustic wavelength and
$W_a$ is the acoustic aperture. It is desirable, but not necessary, to choose
the ratio of the two acoustic apertures so that the two SAWs undergo the same
angular spread, so that whatever diffractional degradation occurs will occur
to both SAWs equally. Thus, we set


$$\Lambda_1 / W_{a1} = \Lambda_2 / W_{a2} \qquad (2)$$


These  considerations  result  in  the  device  parameters  displayed  in  Table  1. 
TABLE I. 32 Mbit/sec CORRELATOR DESIGN

Item, Symbol (Units)                    Value

Low-Frequency Transducer
  Frequency                             459 MHz
  Period, Λ_1                           7.625 µm
  Aperture, W_a1                        1.0 mm
  No. of Finger Pairs                   4
  Bragg Angle @ 0.633 µm, θ             0.0189 rad

High-Frequency Transducer
  Frequency                             875 MHz
  Period, Λ_2                           4.0 µm
  Aperture, W_a2                        0.50 mm
  No. of Finger Pairs                   8
  Bragg Angle @ 0.633 µm                0.0359 rad

Electrooptic Grating
  Period, Λ_eo                          8.413 µm
  Depth, d                              2.0 mm
  No. Periods/bit                       13.0
  Bit length                            109.4 µm
  Bragg Angle, θ                        0.0171 rad
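
The tabulated parameters can be cross-checked against one another. The Python
sketch below is only a consistency check under stated assumptions: the SAW
velocity is taken as frequency times period, the bit length as velocity
divided by the data rate, and the Bragg angle inside the guide as the
small-angle value λ/(2nΛ). The effective index n = 2.2 is an assumed value
that does not appear in the text; all function names are illustrative.

```python
WAVELENGTH_UM = 0.633    # free-space optical wavelength, from the table
N_EFF = 2.2              # ASSUMPTION: effective guide index, not stated in the text
DATA_RATE_HZ = 32e6      # design data rate


def saw_velocity(freq_hz, period_um):
    """Acoustic velocity in m/s from v = f * period."""
    return freq_hz * period_um * 1e-6


def bragg_angle(period_um, n_eff=N_EFF):
    """Small-angle Bragg angle in the guide (rad): lambda0 / (2 * n * period)."""
    return WAVELENGTH_UM / (2.0 * n_eff * period_um)


def spread_angle(period_um, aperture_um):
    """Eq. (1): acoustic diffraction spread angle (rad), period / aperture."""
    return period_um / aperture_um


v_low = saw_velocity(459e6, 7.625)            # low-frequency transducer
v_high = saw_velocity(875e6, 4.0)             # high-frequency transducer
bit_length_um = v_high / DATA_RATE_HZ * 1e6   # length of one bit along the SAW
periods_per_bit = bit_length_um / 8.413       # electrooptic periods per bit

if __name__ == "__main__":
    print(f"SAW velocities: {v_low:.1f} and {v_high:.1f} m/s")
    print(f"Bit length: {bit_length_um:.1f} um, periods per bit: {periods_per_bit:.2f}")
    for period in (7.625, 4.0, 8.413):
        print(f"Bragg angle for {period} um period: {bragg_angle(period):.4f} rad")
    print(f"Spread angles: {spread_angle(7.625, 1000.0):.4f}, "
          f"{spread_angle(4.0, 500.0):.4f} rad")
```

With the assumed index the computed Bragg angles agree with the tabulated
values to within about 1e-4 rad, and the two spread angles from Eq. (1) come
out nearly, though not exactly, equal.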
Binary vs. Bipolar Encoding

The input to the correlator is a 32 Mbit/sec stream which is used to modulate
the r.f. inputs to one or both of the SAW transducers. If only one transducer
is used, then one output of the device will be an electrical signal that is
proportional to the number of coincidences of "on" bits between the preset
reference word and every successive 32-bit sequence in the input data stream;
the other output will be proportional to the number of coincidences of an "on"
bit in the input stream with an "off" bit in the preset reference word. This
corresponds to ordinary binary multiplication.

If  both  SAWs  are  used,  then  one  output  is  proportional  to  the  sum 
of  the  number  of  coincidences  of  "on"  bits  and  the  number  of  coincidences  of 
"off"  bits;  the  other  output  will  be  proportional  to  the  sum  of  the  number 
of  coincidences  of  "on"  bits  in  the  input  stream  with  "off"  bits  in  the 
reference  word  and  the  number  of  coincidences  of  "off"  bits  in  the  input 
stream  with  "on"  bits  in  the  reference  word.  This  corresponds  to  a  bipolar 
encoding,  where,  for  example,  an  "on"  bit  represents  a  +1  and  an  "off"  bit 
represents  a  -1.  These  output  directions  will  be  referred  to  as  the  "+1" 
and  the  "-1"  outputs,  respectively.  The  true  bipolar  correlation  is  formed 
by subtracting the "-1" output from the "+1" output; this may be done
electronically using a differential amplifier.

When  the  input  stream  consists  of  a  32-bit  sequence  alone,  that  is, 
with  no  SAW  signals  on  either  side,  then  one  output  of  the  device  using  the 
binary  arithmetic  is  a  time  sequence  corresponding  to  the  correlation  of  the 
input  sequence  with  the  reference  word  in  the  IOSLM,  while  the  other  output 
is the correlation of the input sequence with the complement of the reference
word. If the bipolar arithmetic is used, i.e., both SAW streams are
present,  then  the  correlation  of  the  input  word  with  the  reference  word  will 
be  the  instant-by-instant  difference  between  the  "+1"  output  and  the  "-1" 
output.  The  importance  of  the  separation  of  the  two  bipolar  outputs  into 
two  directions  can  now  be  seen.  The  desired  outputs  are  proportional  to 
the  light  intensity,  an  intrinsically  positive  quantity.  We  get  around  this 
inconvenience  by  collecting  positive  results  from  one  direction  and  negative 
results  from  another  (both  being  positive  intensities),  then  forming  the 
difference  electronically. 
When the input stream is a continuous stream of data, then at each
alignment  of  a  32-bit  portion  of  the  input  stream  with  the  IOSLM  the  outputs 
will  correspond  to  a  comparison  of  the  aligned  sequences.  With  one  SAW  stream, 
one  output  will  be  proportional  to  the  number  of  SAW  segments  aligned  with 
an  activated  IOSLM  segment  and  the  other  will  be  proportional  to  the  number 
of  SAW  segments  aligned  with  an  unactivated  IOSLM  segment.  A  more  useful 
comparison  occurs  when  the  bipolar  scheme  is  used.  The  "-1"  output  now 
corresponds  to  the  logical  "exclusive  or"  (EXOR)  operation  between  the  two 
words;  so,  when  the  words  are  identical,  this  output  will  be  null.  This 
can  be  a  very  useful  output  for  recognition  of  data  sequences.  The  "+1" 
output  is  the  complement  of  the  "-1"  output  and  therefore  is  maximum  when 
the  two  words  are  identical.  In  noisy  situations,  the  "+1"  output  will  be 
more  useful  than  the  "-1"  output  for  recognition. 
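
The two encodings discussed in this section can be modelled in a few lines of
software. The Python sketch below is an idealized illustration, not a model of
the actual device: it assumes perfect diffraction efficiency, ignores which
end of the word enters the IOSLM first, and treats positions with no SAW
segment as contributing to neither output. All names are illustrative.

```python
def sliding_outputs(reference, stream, bipolar=True):
    """Return the ("+1", "-1") output sequences as the stream slides past
    the reference word. Both arguments are sequences of 0/1 bits."""
    n = len(reference)
    # None marks positions where no SAW segment overlaps the IOSLM.
    padded = [None] * (n - 1) + list(stream) + [None] * (n - 1)
    plus, minus = [], []
    for shift in range(len(padded) - n + 1):
        p = m = 0
        for r, s in zip(reference, padded[shift:shift + n]):
            if s is None:
                continue      # undiffracted light reaches neither detector
            if s == 0 and not bipolar:
                continue      # binary mode: only the level-one SAW is driven
            if s == r:
                p += 1        # (0-0) or (1-1): coincidence direction
            else:
                m += 1        # (0-1) or (1-0): anticoincidence direction
        plus.append(p)
        minus.append(m)
    return plus, minus


def bipolar_correlation(reference, stream):
    """Difference of the two outputs, as formed by a differential amplifier."""
    plus, minus = sliding_outputs(reference, stream, bipolar=True)
    return [p - m for p, m in zip(plus, minus)]


if __name__ == "__main__":
    word = [int(c) for c in "00100111110000111110011000001111"]
    plus, minus = sliding_outputs(word, word)
    print(max(plus), minus[plus.index(max(plus))])
```

At full alignment of identical words the "-1" sequence is zero and the "+1"
sequence equals the word length, which is the behaviour the text describes for
recognition.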

Experimental  Results 

The correlator is fabricated upon a planar indiffused single-mode LiNbO3:Ti
waveguide. The two SAW transducers and the IOSLM are fabricated in a single
photolithographic step using hard-contact techniques. In the current version
of the device, prism couplers are used, and the source (a HeNe laser), the
detector, and their associated optics are bulk-optical components. Hybrid
integration of source and detector is, of course, well within the current
state of the art.

The experimental arrangement used to exercise the correlator is shown in
Figure 3. The heart of this arrangement is an HP Model 8018A 50 MHz Serial
Data/PRBS Generator. The 8018A has 2 NRZ digital outputs. "A" produces
positive pulses which correspond to level one in the output word and "Ā"
produces positive pulses corresponding to the zero level in the output word.
Thus, the A output can be used to modulate one of the SAW transducers and the
Ā output can be used to modulate the other.

To  exercise  the  correlator  a  preset  word  is  read  out  of  the  word 
generator  into  the  data  register.  The  data  register  latches  and  applies 
appropriate  voltages  to  the  32  elements  of  the  IOSLM.  The  digital  word 
generator  is  then  used  to  regenerate  this  word  with  the  output  now  being 
used  to  control  the  modulation  of  the  SAW  transducers.  The  digital  word 
generator  also  has  the  ability  to  bury  the  preset  word  in  a  pseudorandom 
bit sequence (PRBS). This feature is useful in determining the ability of
the  correlator  to  distinguish  the  preprogrammed  word  from  a  series  of  random 
background  words. 


In Figure 4 we show the calculated coincidence output for the autocorrelation
of the digital word displayed in the lower oscilloscope trace. In the upper
trace is the output of detector A when this word is applied via the digital
word generator to the SAW transducers. As can be seen, the experimental result
and the calculated output function are quite similar. In Figure 5 we display
the computed and observed output of detector B. The expected output is
computed by summing the anticoincidences as the signal word moves across the
reference word. Once again the experimental and computed outputs are quite
similar. It should be noted that at the instant that the coincidence output
reaches its maximum the "anticoincidence" output is zero.

The worst case encountered when attempting to perform the recognition function
is discrimination against a signal word which differs from the reference word
in only one bit. If the number of "one" or "on" bits in the reference word is
$N_1$, then the height of the autocorrelation peak is proportional to $N_1$
when the ordinary binary arithmetic is used, and to $N$ (the total number of
bits in the word) when the bipolar arithmetic is used. In the binary case, if
we change a zero to a one in the signal word, then the correlation peak is
unchanged in height, while changing a one to a zero produces a peak
proportional to $N_1 - 1$. In the bipolar case, either change causes a
reduction in peak height to $N - 2$. It is evident, therefore, that the
bipolar arithmetic is superior for recognition applications. It might be noted
that the binary arithmetic gives the same correlation peak height for a word
of all ones as for the reference word.
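
The peak heights quoted in this worst-case argument can be reproduced with a
short calculation. The Python sketch below evaluates only the aligned peak,
using an arbitrary 8-bit reference word chosen for illustration (it is not a
word used in the experiments); the number of ones in the reference is written
n1 and the word length n.

```python
def peak_heights(reference, signal):
    """Aligned-peak heights for binary (0/1) and bipolar (+1/-1) arithmetic."""
    binary = sum(r & s for r, s in zip(reference, signal))
    bipolar = sum((2 * r - 1) * (2 * s - 1) for r, s in zip(reference, signal))
    return binary, bipolar


def flip(word, index):
    """Return a copy of word with the bit at index inverted."""
    return [b ^ 1 if i == index else b for i, b in enumerate(word)]


if __name__ == "__main__":
    reference = [1, 0, 1, 1, 0, 0, 1, 0]           # illustrative word, n = 8, n1 = 4
    print(peak_heights(reference, reference))           # exact match
    print(peak_heights(reference, flip(reference, 1)))  # a zero changed to a one
    print(peak_heights(reference, flip(reference, 0)))  # a one changed to a zero
    print(peak_heights(reference, [1] * 8))             # word of all ones
```

In this example a zero changed to a one leaves the binary peak unchanged while
the bipolar peak drops by two, and a word of all ones gives the full binary
peak but a bipolar value of zero.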

Another parameter of the correlator which is of interest is its response time.
A simple test which revealed some problems in this area was to look at the
autocorrelation of a simple "picket fence" test word (alternating ones and
zeros) which was input into the correlator at successively higher frequencies.
The ideal output for both detectors A and B is a series of triangular waves of
first increasing and then decreasing amplitudes. For both detectors we would
expect the minima to be zero for each of the triangular waves. As can be seen
in Figure 7, the desired behavior is almost achieved for 150 nsec input.
However, as the bit duration is decreased there is a significant degradation
in the output. A number of factors combine to produce this degradation at high
data rates: diffraction spread from the single bits of the SAWs and the IOSLM,
which present optical apertures of 109 µm; the response time of the
photodetector and associated electronics; geometrical effects; and less than
optimal diffraction efficiency of the IOSLM. Examination of the experimental
results and analysis of the relationships among the SAW and the IOSLM grating
segments suggest that the dominant effects are the reduced effective
diffraction efficiency of an isolated, activated IOSLM segment and the reduced
effective width of an unactivated segment surrounded by activated ones. Both
of these are geometric effects. The first arises because light enters and
leaves the segment along its sides as well as across its faces; so some rays
do not experience the entire depth of the segment and pass through the segment
and into the wrong output beam. The second effect is due to the smaller clear
aperture of an unactivated segment when it is viewed from Bragg incidence
because of the high aspect ratio (ratio of grating depth to segment height)
involved. The result is that some of the light that should have passed through
an unactivated segment encounters adjacent, activated segments and is
deflected into the wrong beam. In this sense, the "picket fence" is the worst
possible case since every unactivated segment (a zero) has activated
neighbors. The use of equal-efficiency SAW transducers and the optimization of
the acoustooptic overlap integral by improved waveguide design are expected to
result in significant improvement in the frequency response of the correlator.
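
The size of the geometric effect can be estimated with a simple ray picture.
The Python sketch below is a rough estimate and not the analysis referred to
in the text: it assumes straight rays crossing the full grating depth at the
Bragg angle, and it uses the electrooptic grating values listed in Table I
(depth 2.0 mm, bit length 109.4 µm, Bragg angle 0.0171 rad). The function
names are illustrative.

```python
import math


def lateral_walkoff_um(depth_um, bragg_angle_rad):
    """Sideways displacement of a ray crossing the grating depth at the Bragg angle."""
    return depth_um * math.tan(bragg_angle_rad)


def clear_aperture_fraction(segment_um, depth_um, bragg_angle_rad):
    """Fraction of the segment height through which a ray can cross the full
    depth without leaving through a side (purely geometric estimate)."""
    walk = lateral_walkoff_um(depth_um, bragg_angle_rad)
    return max(0.0, 1.0 - walk / segment_um)


if __name__ == "__main__":
    walk = lateral_walkoff_um(2000.0, 0.0171)
    frac = clear_aperture_fraction(109.4, 2000.0, 0.0171)
    print(f"walk-off {walk:.1f} um, clear fraction {frac:.2f}")
```

Under these assumptions roughly a third of the segment height is affected by
rays that enter or leave through the sides, which is consistent with the
qualitative explanation given above.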

Discussion 

We  have  shown  that  the  device  described  above  is  indeed  capable  of 
producing  the  correlation  of  one  binary  (two-level)  data  word  with  another 
at the data rate of 32 Mbits/sec. The device has the feature that either
ordinary binary (0/1) or bipolar (+1/-1) arithmetic can be accommodated.
Because of the encoding of the outputs in light intensities, negative
quantities cannot be directly encoded, so the "-1" contributions to the
correlations are separately calculated and subtracted after detection whenever
the bipolar arithmetic is used.

The possibility of using the device in a data-recognition (data-retrieval)
system was mentioned earlier. Clearly, so long as the noise level is low, a
scheme using the "-1" output direction to detect a match has a signal-to-noise
advantage over the use of the "+1" output because the output is null (ideally)
whenever a match occurs; reliable recognition depends only on one's ability to
discriminate the output from one or more segments against the low noise level.
In systems where high noise levels may occur, the "+1" output or the
correlation ("+1" output minus the "-1" output) has the advantage because the
high peak for identification can be discriminated against the noise. Reliable
recognition now depends on the ability to discriminate a peak 32 units high
against one 31 units high (32 against 30 for the correlation). Clearly, shorter
words are more reliably recognized than longer ones. If long data sequences
need to be recognized against noise, then some recursive scheme using shorter
words and recognizing the occurrence of the desired data in segments might be
useful.(2)

Finally, we wish to point out that the present device is the first example of
an integrated optical systolic processor. According to Kung's terminology,(3)
the device would be termed a "Design F" convolution array having local weights
(the reference word), moving input variables (the SAW-encoded stream), and a
"fan in" of results (the lens that spatially integrates the light emerging
from the IOSLM). The IOSLM that is the heart of this device has, however, a
number of other applications in analog optical numerical computation using
systolic and related architectures. An example of such a device is a
matrix-vector multiplier in which the matrix data and the vector data move at
right angles to one another in a variation on the systolic type of
architecture.(4) Investigations into processing architectures for implementing
a variety of matrix operations are under way. Such integrated-optical
processing systems appear to offer an attractive solution to several
special-purpose computational problems where extremely high speed is required,
but analog accuracy is acceptable.
FIGURE CAPTIONS

Figure  1.  Schematic  of  the  correlator  showing  the  angular  separation  of 
the  "coincidence",  the  "anticoincidence",  and  the  undiffracted  beams. 

Figure 2. Layout geometry for the 2-SAW correlator, showing angular
relationships. The IOSLM is placed along the caustic generated by the two SAWs.
Angles are given for the small-angle approximation; only the SAW and IOSLM
axes are shown.

Figure  3.  Experimental  arrangement  used  to  exercise  the  correlator.  Angular 
separation  of  the  "coincidence",  "anticoincidence"  and  zero-order  beams  is 
shown. 

Figure 4. Calculated autocorrelation of the 32-bit digital word. The
experimental correlation is shown in the insert along with the input signal.

Figure  5.  Anticoincidence  output  corresponding  to  Fig.  4. 

Figure  6.  Autocorrelation  of  the  32-bit  word  00100111110000111110011000001111 
showing  a)  the  coincidence  (top)  and  anticoincidence  (bottom)  outputs,  b)  the 
bipolar  autocorrelation  and  c)  the  bipolar  autocorrelation  appearing  twice  amid 
cross  correlations  with  a  pseudorandom  bit  sequence. 

Figure 7. Outputs of the coincidence (upper trace) and anticoincidence (lower
trace) detectors during the autocorrelation of an a) 130 nsec b) 60 nsec and
c) 30 nsec picket-fence test word. Degradation of the response at high
frequencies is evident.
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Integrated  optical  circuits  for  numerical  computation* 

C.  M.  Verber  and  R.  P.  Kenan 
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Abstract 

Recent  developments  in  the  design  of  integrated  optical  circuits  for  performing  optical 
numerical  computations  are  discussed.  The  use  of  systolic  architectures  for  these  IOC's  is 
described  and  the  natural  marriage  of  IOC's  with  the  systolic  concept  is  discussed.  Examples 
include optical binary correlation, polynomial evaluation, and matrix multiplication.

I.  Introduction

There has recently been increasing interest in the application of optical
techniques to the solution of a variety of computational problems. The reasons most commonly
cited for this interest are the high processing speeds and the low power consumption
which are potential characteristics of optical analog devices, especially if the problem and
the algorithm are well chosen. We consider here the use of integrated optical circuits (IOCs)
for performing several specific numerical computations. We describe one existing device
and suggest several others, all designed in keeping with the basic architectural
criteria for systolic processors. In Section II we review these criteria and discuss
them in relation to integrated optics technology.

There are a number of basic integrated optic components which are available for use in
computational devices. In this paper we limit ourselves to planar as opposed to channelized
IOCs and rely heavily upon the use of electrooptic gratings whose properties are reviewed in
Section III. As an example of an operational device we describe a 32-bit digital correlator
which operates at 32 MBit/sec. This is followed by several suggestions for matrix multiplication
and polynomial processors. Among the problems which are associated with these optical
devices are a lack of dynamic range and severe nonlinearities. Approaches to the solution
of the second of these problems are presented in Section VII.

II.  Systolic  architectures  and  integrated  optical  circuits 

The  approach  to  computer  design  known  as  systolic  array  architecture  was  developed  by 
Kung [1] and others as a method of approaching the problem of VLSI computer design. The basic
guidelines  are: 

a.  Each  datum  should  be  fetched  from  memory  only  once  to  avoid  the  "von  Neumann 
bottleneck".

b.  Each  chip  should  contain  only  a  small  number  of  different  processor  subunits,  although 
these  subunits  may  be  repeated  many  times  on  each  chip. 

c.  Connections  between  subunits  should  be  only  to  nearest  neighbors  to  facilitate  the 
rapid  flow  of  data  and  to  simplify  fabrication. 

We would be hard pressed to compile a better list of design guidelines for integrated
optical circuits. We do not yet have available an optically addressable memory for IOCs,
although some of Nishihara's [2] surface holograms may be adaptable for this purpose. It is
therefore essential that the recourse to memory be minimized, since the act of fetching data
from a digital store is much slower than the rate at which the IOC is capable of using that
data. Second, at this stage in the development of IOC technology, we have only a small number
of operational building blocks available to us. The second guideline is therefore compatible
with IOC technology, if only by default. The third guideline is, perhaps, not as
important for optical as for electronic systems, since it is possible to have optical carriers
intersect in either planar or channel [3] configurations without causing significant
crosstalk. Complex interconnection schemes can therefore be implemented without requiring
a multilayer structure. However, since the progress of the data through an optical processor
is controlled by the speed of light in the device and not by a digital clock, it will
be necessary to pay attention to path lengths in high-speed devices to assure that proper
synchronism of the data flow is maintained.

There are several obvious advantages to using integrated as opposed to bulk optical techniques
for the implementation of high-speed computational algorithms. Perhaps the most important
is the fact that a variety of high-speed integrated-optical modulators [4] and switches [5]
have already been developed and that these require electrical drive signals which are several
orders of magnitude smaller than those of comparable bulk components. In addition, the integrated
systems tend to be more compact than conventional optical systems and lend themselves to
mass production by more-or-less conventional photolithographic techniques. A major
shortcoming of the IOCs is that they are not capable of the same flexibility in handling
two-dimensional computations as are the bulk devices. A hybrid approach seems to be the
obvious solution to this problem.

III.  Electrooptic  grating  structures 


The devices to be described in the following sections rely heavily on the use of
electrooptically-induced gratings. In this section we will briefly describe the generation and the
operation of these gratings.


The gratings are generated via the electrooptic effect using the fringing field from a
set of interdigital surface electrodes. The basic electrode structure is illustrated in
Fig. 1. The electric field immediately below the electrodes is normal to the waveguide
surface, and at the surface in the gap it is tangential to the waveguide surface. Both of
these fields are periodic with period equal to four line widths (if the line and gap widths
are the same). The amplitudes of the index variations induced by the two fields are not,
however, equal because they generally invoke different electrooptic coefficients. The net
effect of the electrode configuration is to produce a complicated index profile. The fields,
to which the refractive index variations are proportional, have been given by Engan [6] in a
Fourier series; for our uses, only the fundamental component is important. The presence of
two fields causes the index pattern to be shifted relative to the electrode structure; that
is, the maximum of the index modulation does not occur at the centers of the gaps or of the
electrode lines, but is displaced somewhat.


The induced gratings can be operated at high efficiency, if desired, using low voltages.
A typical result is 95% efficiency at voltages of 4-10 volts for a grating with electrode
lines 2 mm long and period 8-15 µm. The diffraction efficiency of a grating having many
fingers appears to follow Kogelnik's [7] theory in form, but typically does not reach 100%
efficiency. The reason for this may be the incomplete overlap of the electric field with
the optical field because of the exponential decay of the former with depth into the waveguide.
Finally, we mention that the capacitance of the surface electrodes on y-cut LiNbO3
is about 0.5 pF/mm of finger length per finger pair, or 1 pF per finger pair for 2 mm long fingers.

Electrooptic gratings are capable of performing simple arithmetic (logic) operations on
analog (binary) voltage signals. The simplest such operation is performed using the basic
element pictured in Fig. 2. The diffracted light beam has intensity equal to $\eta \times I_0$,
and $\eta$ is determined by the voltage difference between the two electrodes. For binary (two-level)
signals, the result is the exclusive OR (EXOR) logic operation. For analog voltages,
the result is a nonlinear function of the voltages, but for small signals it is proportional
to $|V_1 - V_2|^2$, the square of the voltage difference. The problem of the nonlinearity in the
grating response is discussed in Section VII.
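
The behavior of this basic element can be illustrated with a short numerical sketch. It assumes the Bragg-incidence efficiency form sin^2(aV) quoted in Section VII, applied to the voltage difference between the two electrodes; the function names and the value of the constant a (chosen so that a unit voltage difference gives full diffraction) are illustrative assumptions, not device data.

```python
import math

# Assumed device constant: a unit voltage difference gives 100% diffraction.
A_CONST = math.pi / 2


def diffraction_efficiency(v1, v2, a=A_CONST):
    """Efficiency sin^2(a * (V1 - V2)) of the induced grating (assumed form)."""
    return math.sin(a * (v1 - v2)) ** 2


def diffracted_intensity(v1, v2, i_in=1.0, a=A_CONST):
    """Diffracted intensity: efficiency times incident intensity."""
    return diffraction_efficiency(v1, v2, a) * i_in


# Binary (two-level) drive: the diffracted beam reproduces the EXOR truth table.
truth_table = {
    (b1, b2): round(diffracted_intensity(b1, b2))
    for b1 in (0, 1)
    for b2 in (0, 1)
}

# Small-signal analog drive: the output approaches (a * (V1 - V2))^2.
small = diffracted_intensity(0.02, 0.01, a=1.0)
```

With binary drive levels the diffracted output is nonzero only when the two bits differ; with small analog voltages the output is very nearly the squared voltage difference.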

To  multiply  two  signals  together,  we  use  the  "herringbone"  structure  shown  in  Fig.  3.  This 
is  essentially  two  grating-inducing  electrode  systems  using  slanted  fingers  and  placed  so 
that  the  output  of  the  first  is  the  input  to  the  second.  In  the  figure,  the  gratings  have 
been  drawn  to  share  one  electrical  lead,  the  ground,  but  this  is  not  required.  The  output 
here  is  the  input  intensity  multiplied  by  the  product  of  the  efficiencies  of  the  two  gratings. 
Again,  because  the  grating  response  is  nonlinear  in  the  voltages,  some  arrangement  must  be 
used  to  linearize  the  device. 


IV.  The  IO  digital  correlator:  a  systolic  processor  of  design  F

The first IOC that we wish to discuss is an optical space-integrating correlator [8] for
binary words that was developed for AFOSR. In this correlator a set of 32 electrodes of the
type illustrated in Fig. 1 are arranged in a line to form a means for spatially modulating
a light beam in a planar optical waveguide. This array of gratings, which we call an IOSLM
(Integrated Optical Spatial Light Modulator), is activated with a pattern of bits corresponding
to a reference word. In its simplest configuration [8], a binary data stream is used to
modulate a surface acoustic wave (SAW) having the same period as the electrooptic grating
and traveling parallel to it, producing a traveling, spatially-modulated, acoustic grating
that contains the signal to be compared to the reference word. The full width of the electrooptic
grating is illuminated by a guided light beam incident at the common Bragg angle
of the two gratings (the IOSLM and the SAW). Light that is undiffracted and light that is
doubly diffracted pass out of the gratings region in one direction, while light that is once
diffracted passes in a second direction. Integration of the output light in the first direction
by a lens produces a time-dependent electrical signal, upon detection, that is proportional
to the total number of coincidences of "0-0" and "1-1" bit combinations in the two
data sequences. Similarly, the light output in the second direction can be integrated to
yield an electrical signal that is the complement of the first. When the signal consists
simply of a 32-bit word, the difference between these two signals is the correlation of the
two words considered as having been bipolar-encoded; that is, the words are thought of as
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being  represented  by  +1  and  -1  instead  of  as  +1  and  0  as  in  ordinary  binary  arithmetic;  this 
encoding  is  especially  useful  for  performing  recognition  operations. 
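
The relation between the two detector signals and the bipolar correlation can be checked numerically. The following is a minimal sketch of the arithmetic only, not of the optics: the function names are hypothetical, the 32-bit word is the one quoted in the caption of Figure 6, and the zero-padded input stream is an assumed test signal.

```python
def coincidences(ref, sig):
    """Number of positions with '0-0' or '1-1' bit combinations."""
    return sum(1 for r, s in zip(ref, sig) if r == s)


def bipolar_correlation(ref, sig):
    """Coincidence count minus anticoincidence count for one alignment."""
    c = coincidences(ref, sig)
    anti = len(ref) - c
    return c - anti


def bipolar_sum(ref, sig):
    """The same quantity computed directly with bits encoded as +1 / -1."""
    enc = {"0": -1, "1": +1}
    return sum(enc[r] * enc[s] for r, s in zip(ref, sig))


def sliding_correlation(ref, stream):
    """Correlation at each shift as the signal stream passes the fixed reference."""
    n = len(ref)
    return [bipolar_correlation(ref, stream[k:k + n])
            for k in range(len(stream) - n + 1)]


WORD = "00100111110000111110011000001111"  # word quoted for Figure 6
stream = "0" * 32 + WORD + "0" * 32         # assumed test stream
corr = sliding_correlation(WORD, stream)
```

The difference of the coincidence and anticoincidence counts equals the sum of products of the +1/-1 encoded bits, and it peaks at the word length when the stream is aligned with the reference.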

The correlator described above has some drawbacks, including contamination of the output
beam in the "-1/-1 + 1/1" direction by residual and scattered incident light and the presence of
light in the "1/-1 + -1/1" direction even in the absence of any signal wave. These defects
can be corrected by using two acoustic waves to carry the signal and a nonparallel geometry [9].
This has been done [10], but will not be further discussed here.

We want to point out here that this correlator is an example of a systolic processor of
"design F" as described by Kung [1]. This means that the weights (here, the bits of the reference
word) remain fixed in place while the data (here, the bits of the signal word) move
and the output is collected by "fan-in" (here, the integrating lens and detectors). The
correlator design accurately mimics the most elementary way that space-integrating correlators
are commonly visualized as operating, so it is not surprising that it turns out to be
one of the examples of Kung's group of systolic convolution algorithms. The relevance of
its systolic nature is that it illustrates an advantage of integrated-optical implementation
of systolic architectures, namely, simple fabrication. The device, in both its original and
its improved forms, requires only one photolithographic step to fabricate. Simple fabrication
is an advantage that will occur for all of the devices that will be discussed in this
paper. It will not always be possible to get by with only one photolithographic step, but
none will require more than two.


V.  Matrix  multiplication 

It was shown above that the herringbone structure of Fig. 3 could be used to perform
analog multiplication. This concept can be simply extended to compute the scalar product of
two vectors as shown in Fig. 4. Here the herringbone is segmented, each segment being used
to generate the product $A_i B_i$. The products are then summed with the lens to generate the
scalar product. We shall now show how this structure and some modifications of this structure
can be used to perform vector-matrix and matrix-matrix multiplication.
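
The segmented herringbone can be modeled in a few lines. This is an idealized sketch that assumes each grating has already been linearized, so that its efficiency equals the vector component it represents (a value between 0 and 1); the function names are hypothetical.

```python
def herringbone_segment(i_in, eff_a, eff_b):
    """One segment: input intensity times the efficiencies of its two gratings."""
    return i_in * eff_a * eff_b


def scalar_product(a, b, i_in=1.0):
    """The lens sums the light from all segments onto one detector."""
    return sum(herringbone_segment(i_in, ai, bi) for ai, bi in zip(a, b))


# Components are restricted to [0, 1] because they are diffraction efficiencies.
result = scalar_product([0.5, 0.25, 1.0], [0.5, 1.0, 0.0])
```

The restriction to non-negative components no larger than one reflects the use of light intensity as the carrier; it is a property of this model, not an additional device limitation claimed here.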

It is possible to compute the product of a matrix and a vector using the segmented herringbone
structure along with the engagement architecture shown in Fig. 5. Voltages representing
the vector components and the matrix elements are arranged in the sequence indicated
in the figure and synchronously stepped through the engagement region, which is simply the
segmented herringbone device. The successive products are accumulated on integrating photodetectors
as indicated. A schematic of an IOC for accomplishing this is shown in Fig. 6. The
major problem in the practical implementation of this technology lies not in the fabrication of
the IOC, but in the design of a suitable electronic drive circuit which neither unduly
limits the speed of the optical device nor overwhelms it with the sheer bulk of the electronic
hardware.
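
The time-stepped accumulation can be sketched as follows. The exact data sequence of Fig. 5 is not reproduced in the text, so the ordering used here is an assumption: vector component x[j] is taken to reach segment i at clock epoch i + j, where it meets matrix element A[i][j], and detector i integrates over all epochs. Names are hypothetical.

```python
def engagement_matvec(A, x):
    """Accumulate y[i] = sum_j A[i][j] * x[j] one clock epoch at a time."""
    m, n = len(A), len(x)
    detectors = [0.0] * m                  # integrating photodetectors
    for t in range(m + n - 1):             # clock epochs
        for i in range(m):                 # segment i of the herringbone
            j = t - i                      # vector component now at segment i
            if 0 <= j < n:                 # segment is idle outside this window
                detectors[i] += A[i][j] * x[j]
    return detectors


A = [[0.5, 0.25], [1.0, 0.0], [0.25, 0.75]]
x = [1.0, 0.5]
y = engagement_matvec(A, x)
```

Each detector receives one product per epoch while data are passing its segment, so the full product is available after m + n - 1 epochs in this assumed ordering.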

A systolic [1] approach to matrix-matrix multiplication is shown schematically in Fig. 7. The
data flow through the engagement region as indicated, each box in the engagement region
being a device which performs a running sum of the products of the respective matrix components,
which again are flowing synchronously through the device. Note that in order to
obtain proper registration of the elements of the two matrices, the components must enter
the engagement region in an appropriately skewed array.
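
The need for skewed entry can be seen in a small simulation. This is a generic sketch of a systolic array in which each cell keeps a running sum, rows of A flow to the right, and columns of B flow downward; it is not taken from Fig. 7, and the skew rule (row i of A delayed by i epochs, column j of B delayed by j epochs) is the standard choice assumed here.

```python
def systolic_matmul(A, B):
    """Simulate an n x p grid of accumulate cells fed with skewed data."""
    n, m, p = len(A), len(B), len(B[0])
    acc = [[0.0] * p for _ in range(n)]     # running sums, one per cell
    a_reg = [[0.0] * p for _ in range(n)]   # A value currently at each cell
    b_reg = [[0.0] * p for _ in range(n)]   # B value currently at each cell
    for t in range(n + m + p - 2):          # clock epochs
        for i in range(n):                  # A values move one cell to the right
            for j in range(p - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i                       # skew: row i is delayed by i epochs
            a_reg[i][0] = A[i][k] if 0 <= k < m else 0.0
        for j in range(p):                  # B values move one cell downward
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j                       # skew: column j is delayed by j epochs
            b_reg[0][j] = B[k][j] if 0 <= k < m else 0.0
        for i in range(n):                  # every cell multiplies and accumulates
            for j in range(p):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


C = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

With this skew, the pair A[i][k], B[k][j] meets at cell (i, j) at epoch i + j + k, which is what keeps matching indices in registration.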

A schematic of an integrated optical circuit for implementing the algorithm of Fig. 7 is
shown in Fig. 8. In this figure the herringbone structure has been disassembled. A uniform
plane guided wave is incident upon $b_{ij}$ modulator units where it has the appropriate intensity
modulation impressed upon it. This information is then carried by the light through a
series of beam splitters which distribute it to the appropriate $a_{ij}$ modulators. Since the
optical distribution of information is essentially instantaneous compared to the rate at
which the electronic drive circuitry can shift voltages through the system, we must remove
the skew from the A matrix element array to maintain proper synchronism. Once again it
would appear that the major challenge in the fabrication of a complete matrix-matrix multiplier
using these concepts will be in the design of high-speed, compact electronic drive
circuitry.


VI.  Pipeline  processor  for  polynomial  evaluation 

Recently, Verber et al. [11] proposed an optical pipeline processor for the evaluation of
polynomials. Their systolic architecture for performing this important task optically was
designed initially to utilize bulk optical components, but an integrated-optical implementation
was also proposed. It is this design that we discuss here.
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The  first  step  in  designing  an  optical  processor  for  polynomial  evaluation  is  to  rewrite 
the  polynomial  in  a  recursive  form,  using  synthetic  division: 

$$y = p(x) = a_N x^N + a_{N-1} x^{N-1} + \cdots + a_1 x + a_0$$

$$\phantom{y} = (\cdots((a_N x + a_{N-1})x + a_{N-2})x + \cdots + a_1)x + a_0 .$$

It  is  easily  seen  in  this  form  that  the  polynomial  can  be  evaluated  recursively  using  a 
simple  unit  that  multiplies  its  two  inputs  together  and  adds  a  constant  to  form  one  output 
and  passes  one  of  the  inputs  through  to  form  another  output,  as  shown  in  Fig.  9.  Chaining 
N  of  these  units  together  to  form  a  pipeline  will  then  form  an  evaluator  for  a  polynomial  of 
order  N. 
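
The recursive form maps directly onto a chain of identical units. The following is a minimal sketch of that chaining in software; the function names and the coefficient ordering (highest order first) are choices made for illustration.

```python
def stage(p_in, x, a):
    """One unit: multiply the two inputs, add a constant, and pass x through."""
    return p_in * x + a, x


def evaluate(coeffs, x):
    """Evaluate a polynomial given coeffs = [a_N, ..., a_1, a_0]."""
    p = coeffs[0]                 # the leading coefficient enters the first unit
    for a in coeffs[1:]:          # N units for a polynomial of order N
        p, x = stage(p, x, a)
    return p


value = evaluate([2, 0, 3, 1], 2)   # 2x^3 + 3x + 1 at x = 2
```

Each call to `stage` corresponds to one module of the kind shown in Fig. 9, so a polynomial of order N passes through exactly N of them.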

Implementation of this architecture as an integrated optical circuit (IOC) can be accomplished
using an electrooptic grating for the multiplier and an ordinary surface grating of,
say, As2S3 for an adder. Although both elements are gratings, they have characteristics
that are sufficiently different to warrant discussion.

The operation of simple electrooptic gratings has already been discussed in Section III.
The relevant feature for the present section is the typical large period, usually larger
than 3 µm. This means that these gratings, in spite of their depth, are not very selective.
The wavelength selectivity of a Bragg grating can be expressed in terms of the angular
selectivity through


$$\Delta\lambda_{1/2} = 2 n \Lambda \cos\theta_B \, \Delta\theta_{1/2}$$

with $\Delta\theta_{1/2}$ being about 0.9 period/depth. In LiNbO3, a grating with period 8 µm and depth
2 mm yields $\Delta\lambda_{1/2}$ > 1200 Å. Hence, different light sources, having different wavelengths,
can be used with assurance that the multipliers will operate properly.
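
The quoted figure can be reproduced with a short calculation based on the relation above, in which the wavelength half-width is 2 n Λ cos θ_B times the angular half-width, and the angular half-width is about 0.9 period/depth. The refractive index (2.2) and the operating wavelength (0.83 µm) used below are assumed values, as is the Bragg condition λ = 2 n Λ sin θ_B used to obtain the Bragg angle; none of these is stated in this passage.

```python
import math


def angular_half_width(period, depth):
    """Angular selectivity, about 0.9 * period / depth (radians)."""
    return 0.9 * period / depth


def wavelength_half_width(n, period, depth, wavelength):
    """Wavelength selectivity 2 * n * period * cos(theta_B) * angular half-width."""
    theta_b = math.asin(wavelength / (2.0 * n * period))   # assumed Bragg condition
    return 2.0 * n * period * math.cos(theta_b) * angular_half_width(period, depth)


# Electrooptic grating: period 8 um, depth 2 mm; index and wavelength are assumptions.
dl_m = wavelength_half_width(n=2.2, period=8e-6, depth=2e-3, wavelength=0.83e-6)
dl_angstrom = dl_m / 1e-10
```

Under these assumptions the half-width comes out somewhat above 1200 Å, consistent with the bound given in the text.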

A surface grating fabricated holographically in a suitable material like As2S3, in contrast
to the electrooptic case, can be made to be much more wavelength selective because of
the very small periods that can be achieved. At a wavelength of 0.83 µm, a surface grating
having a Bragg angle of 30 degrees and a depth of 1 mm has a wavelength selectivity of about
6 Å. This means that light from one source can pass through the surface grating without
being diffracted, while light from a second source can be efficiently diffracted by the grating,
so that addition can take place without loss.

With these remarks in mind, we can consider the IOC pictured in Fig. 10. The coefficients
of the polynomial are entered by modulating individual light sources, so N+1 light sources
are needed. These sources are selected to have wavelengths differing by amounts sufficient
to allow their light to pass all of the surface gratings save the one through which they are
injected. Each source has its own collimating waveguide lens, e.g., a Luneburg lens. Each
unit of the processor consists of an electrooptic multiplier followed by a surface-grating
adder. The electrooptic multipliers are actuated by the voltage x, corresponding to the
argument at which the polynomial is to be evaluated. Since $a_4$ passes through all multipliers,
it is multiplied by $x^4$; $a_3$ is similarly multiplied by $x^3$; etc. The relatively coarse
electrooptic gratings operate on all the light incident on them because of the close spacing of
the wavelengths. In contrast, the surface gratings act only on the light of the wavelength
of the source that they are injecting into the pipeline. The x-values propagate down the
pipeline from left to right, following the partially assembled polynomial. Once the pipeline
is filled, the processor will output one evaluation per "pulse", where a pulse is the time
required for light to traverse one unit of the processor.
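
The filling of the pipeline and the subsequent rate of one evaluation per pulse can be illustrated with a simple discrete-time model. This is a sketch only: each unit is assumed to take exactly one pulse, the latches between units are a modeling device, and the names are hypothetical.

```python
def run_pipeline(coeffs, xs):
    """Stream arguments xs through N units; coeffs = [a_N, ..., a_0], N >= 1.

    Returns the processor output at each pulse (None while the pipeline fills).
    """
    n_units = len(coeffs) - 1
    if n_units < 1:
        raise ValueError("at least one unit (a first-order polynomial) is required")
    latches = [None] * n_units                   # (partial sum, x) leaving each unit
    outputs = []
    feed = list(xs) + [None] * (n_units - 1)     # idle pulses flush the pipeline
    for x_in in feed:
        # Update from the last unit backwards so each datum advances one unit per pulse.
        for k in range(n_units - 1, 0, -1):
            prev = latches[k - 1]
            latches[k] = None if prev is None else (prev[0] * prev[1] + coeffs[k + 1], prev[1])
        latches[0] = None if x_in is None else (coeffs[0] * x_in + coeffs[1], x_in)
        last = latches[-1]
        outputs.append(None if last is None else last[0])
    return outputs


out = run_pipeline([2, 0, 3, 1], [0, 1, 2, 3])   # 2x^3 + 3x + 1
```

For a third-order polynomial the first two pulses produce no output; after that, one value of the polynomial emerges on every pulse.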

Finally, it should be noted that the processor described operates with light intensities
which are intrinsically positive and real, and inserts the coefficients by modulation of a
light source, so the a's are also positive and real. It is, however, a simple matter to
stack pipelines in parallel to implement both complex a's and complex x's, including components
of either sign. Since adding both signs requires doubling the number of pipelines
and implementing complex numbers also doubles the number of pipelines, the fully complex system
requires 16 pipelines, with associated electronics. Fig. 11 shows eight pipelines configured
to implement complex x and real a, with the additional configuration to extend to
complex a indicated below.


VII.  The  linearization  problem 


Throughout this paper, we have utilized electrooptic gratings for multiplication, and have
indicated some of their advantages in this role. Here, we discuss one disadvantage of Bragg
gratings, namely, their inherent nonlinearity, caused by the dependence of diffraction efficiency
on voltage. The preferred approach, of course, would be to find a multiplier element
having a linear voltage response. Alternative solutions are discussed below.
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The  efficiency  of  an  electrooptic  grating  can  be  written,  at  Bragg  incidence,  as 

$$\eta = \sin^2(aV)$$

where a is a constant. This nonlinear response means that some method must be found to produce
a voltage from the input variable so that an increment in the input variable produces
a proportional increment in $\eta$. Let x denote the input variable. Then, we need to find a
voltage V(x) of the form

$$V(x) = \sin^{-1}(\sqrt{x})/a$$

This can be done with digital electronics, requiring one circuit for each electrooptic grating.
Alternatively, some ac signal processing could be used, but this becomes more and
more complicated as the order of the polynomials increases. The simplest solution will probably
be to use an analog electronic circuit to extract the square root of x, and adjust the
operating voltages so that one remains in the small signal regime. In this case,

$$\eta \approx (aV)^2 = x$$

This  keeps  the  circuitry  simple,  although  it  leads  to  a  loss  of  signal-to-noise  ratio.  If 
noise  becomes  a  problem,  as  it  well  may  in  large-order  polynomials,  then  the  full  arcsine 
function  must  be  used. 
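
The two linearization schemes can be compared numerically. The sketch below uses the efficiency sin^2(aV) and the drive law V(x) = arcsin(sqrt(x))/a given above, together with the small-signal drive V(x) = sqrt(x)/a; the value a = 1 and the function names are illustrative assumptions.

```python
import math

A_CONST = 1.0   # assumed value of the device constant a


def efficiency(v, a=A_CONST):
    """Bragg-incidence efficiency of the electrooptic grating, sin^2(a V)."""
    return math.sin(a * v) ** 2


def drive_arcsine(x, a=A_CONST):
    """Exact linearizing drive voltage, arcsin(sqrt(x)) / a."""
    return math.asin(math.sqrt(x)) / a


def drive_sqrt(x, a=A_CONST):
    """Small-signal drive voltage, sqrt(x) / a."""
    return math.sqrt(x) / a


exact_half = efficiency(drive_arcsine(0.5))    # arcsine drive reproduces x
small_low = efficiency(drive_sqrt(0.01))       # square-root drive, small x
small_high = efficiency(drive_sqrt(0.5))       # square-root drive, large x
```

The arcsine drive returns the input exactly over the full range, whereas the square-root drive is accurate only for small x and falls well below x as the efficiency becomes large, which is why the operating voltages must be restricted in that scheme.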


Conclusions 

In this paper, we have reviewed several kinds of systolic architectures that can be used
in an integrated-optical circuit to perform numerical computations ranging from simple logic
operations to polynomial evaluation to matrix operations. All of the devices reviewed utilize
electrooptically-induced gratings in an electrooptic waveguide. There are, of course,
other ways to perform some of these operations, including surface acoustic waves; and there
are surely many other numerical computations that can be performed optically using integrated
optics. It is hoped that this review will stimulate others to join in the search for new
applications in this exciting area.
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Fig.  1.  The  basic  electrode  structure  for 
inducing  electrooptic  gratings, 
showing  the  electrode  parameters. 
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Fig.  2.  Schematic  of  the  use  of  an  induced 
grating  for  subtraction  (or  logical 
EXOR).


Fig.  5.  Illustration  of  the  engagement  architecture 
for  vector-matrix  multiplication. 
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Fig.  6.  Integrated-optical  realization  of  the 
architecture  of  Fig.  5. 
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Fig.  7.  Illustration  of  a  systolic  architecture 
for  matrix-matrix  multiplication. 
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Fig.  8.  Schematic  layout  for  an  integrated- 
optical  realization  of  the  architec¬ 
ture  of  Fig.  7. 
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Fig.  9.  The  basic  calculation  module.  Pn+1 

is  the  partially  assembled  polynomial 
from  the  previous  stage.  pn(x)  is  the 
output  from  the  present  stage. 
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Fig.  10.  An  integrated-optical  implementation 
of  the  pipeline  processor,  utilizing 
electrooptic  grating  modulators  and 
surface  grating  adders. 
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Fig.  11.  Illustration  of  the  problem-division 
technique.  Top:  divisions  needed  to 
accommodate  complex,  positive  a^  and 
complex  x  of  either  sign.  Bottom: 
further  division  to  accommodate 
either  sign  for  a^. 
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